This paper presents the optimization of an H.264/AVC encoder on three different architectures. The architectures are multi-core and single-core, and their SIMD instruction sets have different vector register sizes. Code optimization is essential when targeting HD resolutions under real-time constraints. The encoder is subdivided into functional modules in order to identify where optimization is a key factor and to evaluate the performance improvement in detail. Common issues in both partitioning a video encoder across parallel architectures and SIMD optimization are described, and the authors' solutions are presented for all the architectures. Besides showing efficient video encoder implementations, one of the main purposes of this paper is to discuss how the characteristics of different architectures and different sets of SIMD instructions can affect the performance of the target application. Results on the achieved speedup are provided in order to compare the different implementations and to evaluate the most suitable solutions for present and next-generation video-coding algorithms.